Conjecture: Real-world events will tend to have multiple causes

November 11, 2024

When some real-world event happens, we tend to seek out which variables (among many possible variables) caused it. Knowing this lets you influence whether or not that event happens in the future -- so it seems self-evident that we evolved a proclivity towards hunting for causal variables where we can find them. What's somewhat less self-evident is our preference for single causal variables instead of many [#]. There are many potential reasons why our explanations minimize the number of relevant variables: computational efficiency, ease of communication, ease of memory, information complexity constraints [#], etc. However, I'm not aware of any a priori reason to assume that the base rate for the number of causal variables that explain some event should hover around 1^[1].

So, explanations of real-world events have some number of causal variables that we should expect to be relevant. In different domains, this number might vary; in some cases, it might be very low (i.e., "monogenic"^[2]). But my conjecture here is that many of the hard-to-predict domains we care about (which companies succeed, which nations go to war, what coalitions form in our social networks) have a number of relevant causal variables that is particularly high.

Simple example: a horse dies on a farm. Was it due to:

malnutrition
cold weather
a bad virus

We'll say a horse dies when it no longer has enough energy to sustain its bodily functions, and that each of these factors has an additive effect on the amount of energy in the horse's body. So now we'll define a critical threshold of energy the horse needs to stay alive. Let's call it 0.2.

So now we randomly sample from the space of possible horse realities, with varying conditions.

	Malnutrition	Cold Weather	Virus
Horse 1	1	1	0
Horse 2	0	0	0
Horse 3	0	0	0
Horse 4	1	1	0
Horse 5	1	0	1
Horse 6	0	0	1
Horse 7	1	1	1
Horse 8	1	0	0

We'll say each condition (malnutrition, cold, virus) has an effect of -0.33 on available energy, which starts at 1.0 for a horse with none of them. The horse dies when energy falls below 0.2. Now we get:

	Malnutrition	Cold Weather	Virus	Energy	Alive
Horse 1	1	1	0	0.34	1
Horse 2	0	0	0	1.00	1
Horse 3	0	0	0	1.00	1
Horse 4	1	1	0	0.34	1
Horse 5	1	0	1	0.34	1
Horse 6	0	0	1	0.67	1
Horse 7	1	1	1	0.01	0
Horse 8	1	0	0	0.67	1
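
For anyone who wants to poke at the numbers, here is a minimal Python sketch of the toy model. The names (`EFFECT`, `THRESHOLD`, `energy`, `alive`) are made up for illustration, and the baseline energy of 1.0 is an assumption read off the horses with no conditions. With exact subtraction, a horse with one condition lands at 0.67 and a horse with two lands at 0.34.

```python
# Toy additive-threshold model of the horse example.
# All names here are hypothetical; the numbers come from the example.
EFFECT = 0.33     # energy lost per condition present
THRESHOLD = 0.2   # the horse dies when energy falls below this
BASELINE = 1.0    # assumed starting energy (horses with no conditions)

# (malnutrition, cold weather, virus) for horses 1 through 8
horses = [
    (1, 1, 0),
    (0, 0, 0),
    (0, 0, 0),
    (1, 1, 0),
    (1, 0, 1),
    (0, 0, 1),
    (1, 1, 1),
    (1, 0, 0),
]

def energy(conditions):
    """Subtract EFFECT from the baseline once per condition present."""
    return round(BASELINE - EFFECT * sum(conditions), 2)

def alive(conditions):
    """Return 1 if energy is at or above the threshold, else 0."""
    return int(energy(conditions) >= THRESHOLD)

for i, h in enumerate(horses, start=1):
    print(f"Horse {i}\t{h}\t{energy(h):.2f}\t{alive(h)}")
```

Changing `EFFECT` or `THRESHOLD` is a quick way to see how sensitive the "dead horse" outcome is to where the line sits.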

So one farmer might say, 'horse 7 died from a virus', and they'd be partially correct. But another farmer would say, 'no, it was the cold', and so on. Neither is considering the more complicated model, in which the event of "dead horse" never presents itself in our universe until the totality of relevant variables pushes past some qualitative threshold. Things get more complicated as the number of variables grows and the "thresholds" become fuzzy, but you get the idea.
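
One way to make the farmers' disagreement concrete is a counterfactual check: take horse 7 and remove one condition at a time. The sketch below is only an illustration under the toy model's assumptions (additive effects of 0.33, a threshold of 0.2, and an assumed baseline energy of 1.0); the names `is_dead`, `fatal`, and `counterfactuals` are hypothetical.

```python
from itertools import product

EFFECT = 0.33     # energy lost per condition present
THRESHOLD = 0.2   # the horse dies when energy falls below this
BASELINE = 1.0    # assumed starting energy
NAMES = ("malnutrition", "cold weather", "virus")

def is_dead(conditions):
    """True when the remaining energy falls below the threshold."""
    return BASELINE - EFFECT * sum(conditions) < THRESHOLD

# Enumerate every combination of conditions and keep the fatal ones.
fatal = [c for c in product((0, 1), repeat=len(NAMES)) if is_dead(c)]
print("fatal combinations:", fatal)

# Horse 7 had all three conditions. Remove each one in turn.
horse_7 = (1, 1, 1)
counterfactuals = {}
for i, name in enumerate(NAMES):
    without = tuple(0 if j == i else v for j, v in enumerate(horse_7))
    counterfactuals[name] = is_dead(without)
    print(f"without {name}: dead = {counterfactuals[name]}")
```

In this setup, removing any single condition keeps the horse alive, so each farmer can point at their favorite variable and be right in the counterfactual sense, while none of the variables is sufficient on its own.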

It isn't particularly enlightening, or that big of a conceptual leap, to say that real-world explanations might be "polygenic" more often than not^[3]. And maybe it's more socially rational to have a unidimensional bias. Maybe it makes communication easier. Maybe it provides a social scaffolding where other people can aggregate our simple models into more complicated ones (that happen to be more accurate). Maybe it's a fundamental part of how society evolved to tackle complicated problems as groups. But even as we follow the ritual of reacting to a real-world event and publicly professing our predicted "monogenic" explanations, we should keep in the back of our minds the understanding that many real-world systems will only manifest particular events when a critical number of conditions are met.

[1] We can appeal to Occam's razor as a useful heuristic that says we should favor simpler explanations among equally valid competitors, but that isn't by itself an empirical claim about the modal structure of real-world phenomena.
[2] To borrow a term from biology.
[3] Though it isn't self-evident either.